FMU: Fast Mining of Probabilistic Frequent Itemsets in Uncertain Data Streams
Authors
Abstract
Discovering Probabilistic Frequent Itemsets (PFI) in uncertain data is very challenging, since algorithms designed for deterministic data are not applicable in this context. The problem is even more difficult for uncertain data streams, where massive, frequent updates need to be taken into account while respecting data stream constraints. In this paper, we propose FMU (Fast Mining of Uncertain data streams), the first solution for exact PFI mining in data streams with sliding windows. FMU allows updating the frequentness probability of an itemset whenever a transaction is added to or removed from the observation window. Using these update operations, we are able to extract PFI in sliding windows with very low response times. Furthermore, our method is exact, meaning that we are able to discover the exact probabilistic frequentness distribution function for any monitored itemset, at any time. We implemented FMU and conducted an extensive experimental evaluation over synthetic and real-world data sets; the results illustrate its efficiency.
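To make the idea of updating a frequentness probability concrete, the following is a minimal sketch and not the FMU algorithm itself, whose details the abstract does not give. It assumes that each transaction in the window contains the monitored itemset independently with a known probability, so the support follows a Poisson binomial distribution that can be updated by one convolution step on insertion and one deconvolution step on removal. The function names `add_transaction`, `remove_transaction`, and `frequentness_probability` are hypothetical.

```python
from typing import List


def add_transaction(dist: List[float], p: float) -> List[float]:
    """Support distribution after a transaction enters the window.

    dist[k] is the probability that exactly k window transactions contain
    the itemset; p is the probability that the new transaction contains it.
    Assumes transactions are independent (an assumption of this sketch).
    """
    new = [0.0] * (len(dist) + 1)
    for k, mass in enumerate(dist):
        new[k] += mass * (1.0 - p)   # new transaction lacks the itemset
        new[k + 1] += mass * p       # new transaction contains the itemset
    return new


def remove_transaction(dist: List[float], p: float) -> List[float]:
    """Support distribution after a transaction with probability p leaves.

    Inverts add_transaction by solving dist[k] = old[k]*(1-p) + old[k-1]*p.
    The recurrence direction is chosen so the divisor is at least 0.5,
    which keeps floating-point error from being amplified.
    """
    n = len(dist) - 1
    if n < 1:
        raise ValueError("window is empty; nothing to remove")
    old = [0.0] * n
    if p <= 0.5:
        prev = 0.0
        for k in range(n):                 # forward: divide by (1 - p)
            old[k] = (dist[k] - p * prev) / (1.0 - p)
            prev = old[k]
    else:
        nxt = 0.0
        for k in range(n - 1, -1, -1):     # backward: divide by p
            old[k] = (dist[k + 1] - (1.0 - p) * nxt) / p
            nxt = old[k]
    return old


def frequentness_probability(dist: List[float], min_sup: int) -> float:
    """Probability that the support is at least min_sup."""
    return sum(dist[min_sup:])
```

As a usage example under the same assumptions: starting from the empty-window distribution `[1.0]`, calling `add_transaction` with 0.5 and then 0.8 gives a three-entry distribution over supports 0, 1, and 2, and `remove_transaction` with 0.5 recovers the distribution of the window that holds only the second transaction. Each update costs time linear in the window size; whether FMU achieves its response times this way is not stated in the abstract.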
Similar resources
Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows
Today, many modern applications search for frequent, repeating patterns in the data sets they analyze. This search looks for patterns that appear frequently in a data set and marks them as frequent patterns, so that users can make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection work either on uncertai...
Fast and Exact Mining of Probabilistic Data Streams
Discovering Probabilistic Frequent Itemsets (PFI) is very challenging since algorithms designed for deterministic data are not applicable in probabilistic data. The problem is even more difficult for probabilistic data streams where massive frequent updates need to be taken into account while respecting data stream constraints. In this paper, we propose FEMP (Fast and Exact Mining of Probabilis...
Efficient mining of temporal emerging itemsets from data streams
In this paper, we propose a new method, namely EFI-Mine, for mining temporal emerging frequent itemsets from data streams efficiently and effectively. The temporal emerging frequent itemsets are those that are infrequent in the current time window of data stream but have high potential to become frequent in the subsequent time windows. Discovery of emerging frequent itemsets is an important pro...
Incremental updates of closed frequent itemsets over continuous data streams
Online mining of closed frequent itemsets over streaming data is one of the most important issues in mining data streams. In this paper, we propose an efficient one-pass algorithm, NewMoment to maintain the set of closed frequent itemsets in data streams with a transaction-sensitive sliding window. An effective bit-sequence representation of items is used in the proposed algorithm to reduce the...
DELAY-CFIM: A Sliding Window Based Method on Mining Closed Frequent Itemsets over High-Speed Data Streams
Closed frequent itemset mining plays an essential role in data stream mining. It could be used in business decisions, basket analysis, etc. Most methods for mining closed frequent itemsets store the streamlined information in compact data structure when data is generated. Whenever a query is submitted, it outputs all closed frequent itemsets. However, the online processing of existing approache...